Learning to Distribute Queries into Web Search Nodes

Authors

  • Marcelo Mendoza
  • Mauricio Marín
  • Flavio Ferrarotti
  • Barbara Poblete
Abstract

Web search engines are composed of a large set of search nodes and a broker machine that feeds them with queries. A location cache keeps minimal information in the broker to register the search nodes capable of producing the top-N results for frequent queries. In this paper we show that it is possible to use the location cache as a training dataset for a standard machine learning algorithm and build a predictive model of the search nodes expected to produce the best approximated results for queries. This can be used to prevent the broker from sending queries to all search nodes under situations of sudden peaks in query traffic and, as a result, avoid search node saturation. This paper proposes a logistic regression model to quickly predict the most pertinent search nodes for a given query.
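The idea in the abstract can be illustrated with a minimal sketch. It assumes the location cache is available as a mapping from a query string to the set of search node ids that produced its top-N results, and that queries are represented by binary term features. Neither the data, the feature choice, nor the names (`LOCATION_CACHE`, `train`, `predict_nodes`) come from the paper; they are hypothetical and chosen only for illustration. One logistic regression model is fitted per search node, and the broker forwards a query only to the k nodes with the highest predicted probability.

```python
import math

# Hypothetical location cache: query -> ids of the search nodes that
# contributed to its top-N results (toy data, not taken from the paper).
LOCATION_CACHE = {
    "football scores": {0},
    "football league table": {0, 2},
    "python tutorial": {1},
    "python list sort": {1},
    "league of nations history": {2},
    "history of football": {0, 2},
}
NUM_NODES = 3


def build_vocab(queries):
    """Map every term seen in the cached queries to a feature index."""
    terms = sorted({t for q in queries for t in q.split()})
    return {t: i for i, t in enumerate(terms)}


def featurize(query, vocab):
    """Binary bag-of-words vector; unseen terms are ignored."""
    x = [0.0] * len(vocab)
    for t in query.split():
        if t in vocab:
            x[vocab[t]] = 1.0
    return x


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, w, b):
    """Predicted probability that a node is relevant for features x."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def train_node_model(X, y, epochs=500, lr=0.5):
    """Fit one logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = score(x, w, b) - target  # gradient of the log loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def train(cache, num_nodes):
    """Train one binary model per search node (one-vs-rest)."""
    vocab = build_vocab(cache)
    queries = list(cache)
    X = [featurize(q, vocab) for q in queries]
    models = []
    for node in range(num_nodes):
        y = [1.0 if node in cache[q] else 0.0 for q in queries]
        models.append(train_node_model(X, y))
    return vocab, models


def predict_nodes(query, vocab, models, k=1):
    """Return the k node ids with the highest predicted probability."""
    x = featurize(query, vocab)
    probs = [score(x, w, b) for w, b in models]
    ranked = sorted(range(len(models)), key=lambda n: probs[n], reverse=True)
    return ranked[:k]


vocab, models = train(LOCATION_CACHE, NUM_NODES)
print(predict_nodes("python sort", vocab, models, k=1))
```

Under a traffic peak, a broker following this sketch would call `predict_nodes` with a small `k` instead of broadcasting the query to every node; how `k` is chosen, and which features the authors actually use, is not stated in the abstract.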

Similar resources

Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type

Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...

Distributed Web Search as a Stochastic Game

Distributed search systems are an emerging phenomenon in Web search, in which independent topic-specific search engines provide search services, and metasearchers distribute user’s queries to only the most suitable search engines. Previous research has investigated methods for engine selection and merging of search results (i.e. performance improvements from the user’s perspective). We focus in...

Discovery of Environmental Nodes in the Web

Analysis and processing of environmental information is considered of utmost importance for humanity. This article addresses the problem of discovery of web resources that provide environmental measurements. Towards the solution of this domain-specific search problem, we combine state-of-the-art search techniques together with advanced textual processing and supervised machine learning. Specifi...

Social Network Analysis of yahoo web-search engine query logs

Web is now the undisputed warehouse for information. It can now provide most of the answers for modern problems. Search engines do a great job by combining and ranking the best results when the users try to search for any particular information. However, as we know 'with great power comes great responsibility', it is not an easy task for data analysts to find the most relevant information for t...

Optimising Performance of Competing Search Engines in Heterogeneous Web Environments

Distributed heterogeneous search environments are an emerging phenomenon in Web search, in which topic-specific search engines provide search services, and metasearchers distribute user’s queries to only the most suitable search engines. Previous research has explored the performance of such environments from the user’s perspective (e.g., improved quality of search results). We focus instead on...

Publication date: 2010